The audiovisual integration of speech and different gesture types

Abstract

Gesture and speech are linked in time and meaning. The linkage is limited to the lexico-semantic integration windows of theme-rheme pairs, which by itself already indicates that the temporal synchrony of speech and gesture is more flexible than previously thought. While we know that listeners do perceive both hand and mouth movements (e.g. Gullberg & Kita 2009; Vatakis et al. 2008), audiovisual integration (AVI) research has so far mainly focused on lip-synchrony. The temporal window of successful AVI in the latter is optimal between ~0 and 250ms of asynchrony (Wassenhove et al. 2007; cf. Massaro et al. 1996). Recent ERP studies show that co-occurring gesture strokes and words are integrated by the listener at least up to an auditory delay of 160ms (Habets et al. 2011; Özyürek et al. 2007). But how large can the asynchrony between gestures and their conceptual affiliates be before it is perceived as unnatural? In the studies by Kirchhof & de Ruiter (2012), subjects were shown sentence-long clips of narrators in frontal view in an online interface. The videos were desynchronized at six levels between -600ms and +600ms, based, among others, on Massaro et al. (1996), with 0ms as the (natural) control. Audio gaps were filled with silences, video gaps with stills. A second and a third condition used blurred faces and a box covering all head features, respectively. A total of 618 native speakers of German rated the perceived naturalness of 9327 stimuli on a 4-point Likert scale. In condition 1, all results are around chance (SD=~10) except for +200ms and -600ms (~73%), which is consistent with Wassenhove et al. (2007; see above). In the two obscured-head conditions, subjects rated all stimuli as ~68% natural (SD=~5.5). These findings suggest that the AVI window of gesture is rather large. In a follow-up study, 5 stimuli with asynchronies of -600ms, +200ms, and the control in each condition were rated against each other for naturalness.
While lip visibility resulted in a 50/50 preference between 0ms and +200ms, the head-obscured stimuli again had more random ratings across asynchronies, with +200ms in the lead. The second study let subjects re-synchronize a selection of the clips used in study one in a slider interface, and additional stimuli of two physical events were added as controls. In contrast to a judgment task, the subjective window of audiovisual integration, or rather the preferred synchrony of bimodal signals, was tested here. The results show a classic Gaussian distribution for the physical stimuli, i.e. an acceptable range of …

Similar resources

Neural correlates of bimodal speech and gesture comprehension.

The present study examined the neural correlates of speech and hand gesture comprehension in a naturalistic context. Fifteen participants watched audiovisual segments of speech and gesture while event-related potentials (ERPs) were recorded to the speech. Gesture influenced the ERPs to the speech. Specifically, there was a right-lateralized N400 effect, reflecting semantic integration, when gestu...

Neural correlates of the processing of co-speech gestures

In communicative situations, speech is often accompanied by gestures. For example, speakers tend to illustrate certain contents of speech by means of iconic gestures which are hand movements that bear a formal relationship to the contents of speech. The meaning of an iconic gesture is determined both by its form as well as the speech context in which it is performed. Thus, gesture and speech in...

A measure for assessing the effects of audiovisual speech integration.

We propose a measure of audiovisual speech integration that takes into account accuracy and response times. This measure should prove beneficial for researchers investigating multisensory speech recognition, since it relates to normal-hearing and aging populations. As an example, age-related sensory decline influences both the rate at which one processes information and the ability to utilize c...

Electrophysiological evidence for differences between fusion and combination illusions in audiovisual speech perception

Incongruent audiovisual speech stimuli can lead to perceptual illusions such as fusions or combinations. Here, we investigated the underlying audiovisual integration process by measuring ERPs. We observed that visual speech-induced suppression of P2 amplitude (which is generally taken as a measure of audiovisual integration) for fusions was similar to suppression obtained with fully congruent s...

Publication date: 2012